PyDamage: automated ancient damage identification and estimation for contigs in ancient DNA de novo assembly
Authors
Abstract
DNA de novo assembly can be used to reconstruct longer stretches of DNA (contigs), including genes and even genomes, from short DNA sequencing reads. Applying this technique to metagenomic data derived from archaeological remains, such as paleofeces and dental calculus, we can investigate past microbiome functional diversity that may be absent or underrepresented in the modern microbiome gene catalogue. However, compared to modern samples, ancient samples are often burdened with environmental contamination, resulting in metagenomic datasets that represent mixtures of ancient and modern DNA. The ability to rapidly and reliably establish the authenticity and integrity of ancient samples is essential for ancient DNA studies, and the ability to distinguish between ancient and modern sequences is particularly important for ancient microbiome studies. Characteristic patterns of ancient DNA damage, namely DNA fragmentation and cytosine deamination (observed as C-to-T transitions), are typically used to authenticate ancient samples and sequences, but existing tools for inspecting and filtering aDNA damage either compute it at the read level, which leads to high data loss and lower quality when used in combination with de novo assembly, or require manual inspection, which is impractical for ancient assemblies that typically contain tens to hundreds of thousands of contigs. To address these challenges, we designed PyDamage, a robust, automated approach for aDNA damage estimation and authentication of de novo assembled aDNA. PyDamage uses a likelihood ratio based approach to discriminate between truly ancient contigs and contigs originating from modern contamination. We test PyDamage on both simulated aDNA data and archaeological paleofeces, and we demonstrate its ability to reliably and automatically identify contigs bearing DNA damage characteristic of aDNA. Coupled with aDNA de novo assembly, PyDamage opens up new doors to explore functional diversity in ancient metagenomic datasets.
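The likelihood ratio idea mentioned in the abstract can be illustrated with a minimal sketch. This is not PyDamage's implementation or API: the function names `binom_loglik` and `damage_lrt`, the decay parameterization, the coarse grid search, and the example counts are all assumptions made for illustration. The sketch compares a flat C-to-T rate along a contig's reads (null model) with a rate that decays away from the 5' end (damage model), then turns the likelihood ratio statistic into an approximate p-value using a chi-square distribution with 2 degrees of freedom.

```python
import math


def binom_loglik(k, n, p):
    """Binomial log-likelihood of k C-to-T events among n reference C's.

    The constant binomial coefficient is dropped because it cancels
    in the likelihood ratio.
    """
    p = min(max(p, 1e-9), 1 - 1e-9)  # keep log() finite
    return k * math.log(p) + (n - k) * math.log(1 - p)


def damage_lrt(ct_counts, c_counts):
    """Likelihood ratio test: flat C-to-T rate vs. 5'-decaying rate.

    ct_counts[i] and c_counts[i] are hypothetical per-contig tallies at
    distance i from the 5' read end. Returns (statistic, p_value).
    """
    pairs = list(zip(ct_counts, c_counts))

    # Null model: one constant rate; its MLE is the pooled frequency.
    p0 = sum(ct_counts) / sum(c_counts)
    ll_null = sum(binom_loglik(k, n, p0) for k, n in pairs)

    # Damage model (assumed form): rate(i) = base + amp * decay**i.
    # Starting from ll_null covers the nested case amp = 0, so the
    # statistic can never be negative.
    ll_alt = ll_null
    for base in (b * 0.005 for b in range(21)):          # 0.00 .. 0.10
        for amp in (a * 0.02 for a in range(1, 26)):     # 0.02 .. 0.50
            for decay in (d * 0.1 for d in range(1, 10)):  # 0.1 .. 0.9
                ll = sum(
                    binom_loglik(k, n, base + amp * decay**i)
                    for i, (k, n) in enumerate(pairs)
                )
                ll_alt = max(ll_alt, ll)

    stat = 2.0 * (ll_alt - ll_null)
    # Chi-square survival function with 2 degrees of freedom is exp(-x/2).
    # This is an approximation; the decay parameter is not identifiable
    # under the null, so the true null distribution may differ.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    coverage = [200] * 10
    damaged = [60, 30, 16, 9, 6, 4, 4, 4, 4, 4]  # made-up, 5'-enriched
    flat = [4] * 10                              # made-up, no enrichment
    print("damaged:", damage_lrt(damaged, coverage))
    print("flat:   ", damage_lrt(flat, coverage))
```

A contig whose C-to-T frequency is enriched near the 5' end yields a large statistic and a small p-value, while a contig with a uniform mismatch rate does not. In practice, p-values computed across many thousands of contigs would also need multiple-testing correction.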
Similar resources
Bayesian estimation of sequence damage in ancient DNA.
DNA extracted from archaeological and paleontological remains is usually damaged by biochemical processes postmortem. Some of these processes lead to changes in the structure of the DNA molecule, which can result in the incorporation of incorrect nucleotides during polymerase chain reaction. These base misincorporations, or miscoding lesions, can lead to the inclusion of spurious additional mut...
Improving ancient DNA genome assembly
Most reconstruction methods for genomes of ancient origin that are used today require a closely related reference. In order to identify genomic rearrangements or the deletion of whole genes, de novo assembly has to be used. However, because of inherent problems with ancient DNA, its de novo assembly is highly complicated. In order to tackle the diversity in the length of the input reads, we pro...
Ancient DNA from Human and Animal
Research on ancient DNA (aDNA) has the potential to enable molecular biologists and archeologists to decipher certain aspects of history by looking directly into the past. However, several major problems in this field limit the applicability of aDNA studies, most importantly contamination with modern DNA and postmortem DNA degradation. In this study we extracted and analyzed aDNA obtained from ~3...
FPSAC: fast phylogenetic scaffolding of ancient contigs
MOTIVATION: Recent progress in ancient DNA sequencing technologies and protocols has led to the sequencing of whole ancient bacterial genomes, as illustrated by the recent sequence of the Yersinia pestis strain that caused the Black Death pandemic. However, sequencing ancient genomes raises specific problems, because of the decay and fragmentation of ancient DNA among others, making the scaffo...
DNA damage and DNA sequence retrieval from ancient tissues.
Gas chromatography/mass spectrometry (GC/MS) was used to determine the amounts of eight oxidative base modifications in DNA extracted from 11 specimens of bones and soft tissues, ranging in age from 40 to >50 000 years. Among the compounds assayed hydantoin derivatives of pyrimidines were quantitatively dominant. From five of the specimens endogenous ancient DNA sequences could be amplified by ...
Journal
Journal title: PeerJ
Year: 2021
ISSN: 2167-8359
DOI: https://doi.org/10.7717/peerj.11845